Abstract

This text is based on a lecture for the Sheffield Probability Day; its main purpose is to
survey some recent asymptotic results [3, 4] about Bernoulli bond percolation on certain
large random trees with logarithmic height. We also provide a general criterion for the
existence of giant percolation clusters in large trees, which answers a question raised by
David Croydon.

1 Introduction



It is well-known that percolation is considerably simpler to study on a tree than on a general 
graph, thanks to the property of uniqueness of the path connecting two vertices. We refer 
in particular to Chapter 5 of [9] and references therein for a number of important and useful 
results for infinite trees, such as criteria for the existence or absence of infinite percolation 
clusters. Here, we shall be interested in a somewhat different type of question. Specifically, we
consider a tree of large but finite size, perform a Bernoulli bond percolation with a parameter
that depends on the size of that tree, and investigate the asymptotic behavior of the sizes of
the largest clusters for appropriate regimes when the size of the tree goes to infinity.

Our motivation comes from a celebrated result of Erdős and Rényi on the random graph
model, which can be phrased informally as follows. With high probability when $n \gg 1$,
Bernoulli bond percolation on the complete graph with $n$ vertices and with parameter
$p(n) \sim c/n$ for some fixed $c > 1$ produces a single giant cluster of size close to
$\theta(c) n$, where $\theta(c) \in (0, 1)$ is some known constant, while the second, third,
etc. largest clusters are almost microscopic, and more precisely have size only of order $\ln n$.

In the first part of this text, we provide a simple characterization of tree families and
percolation regimes which yield giant clusters, answering a question raised by David Croydon. In
the second part, we review briefly the main results of [3, 4] concerning two natural families
of random trees with logarithmic heights, namely recursive trees and scale-free trees. In those
works, we show that the next largest clusters are almost giant, in the sense that their sizes are
of order $n/\ln n$, and obtain precise limit theorems in terms of certain Poisson random
measures. A common feature in the analysis of percolation for these models is that, even though
one addresses a static problem, it is useful to consider dynamical versions in which edges are
removed, respectively vertices are inserted, one after the other in a certain order as time passes.

2 Giant clusters 

We first introduce notation and hypotheses which will play an important role in this section.
For a given integer $n$, we consider a set of $n + 1$ vertices, say $V_n = \{0, 1, \ldots, n\}$,
and a tree structure $T_n$ on $V_n$. So $T_n$ has $n$ edges, and we should think of $0$ as the
root of $T_n$. We perform a Bernoulli bond percolation on $T_n$ with parameter $p(n)$, so that
each edge of $T_n$ is kept with probability $p(n)$ and removed with probability $1 - p(n)$,
independently of the other edges. The resulting connected components are then referred to as
clusters.

We write $C^0_{p(n)}$ for the size of the cluster that contains the root; plainly
$C^0_{p(n)} \le n + 1$. We say that $C^0_{p(n)}$ is giant if $n^{-1} C^0_{p(n)}$ converges in law
to some random variable $G \not\equiv 0$, which should be thought of as the asymptotic proportion
of vertices pertaining to the root cluster. David Croydon raised the question of finding a simple
criterion for $C^0_{p(n)}$ to be giant, depending of course on the nature of $T_n$ and on the
regime of the percolation parameter $p(n)$; this motivates the following.

For each fixed $n \in \mathbb{N}$, we denote by $V_1, V_2, \ldots$ a sequence of i.i.d. vertices
in $V_n$ with the uniform distribution. Next, for every $k \in \mathbb{N}$, we write $L_{k,n}$
for the length of the tree $T_n$ reduced to $V_1, \ldots, V_k$ and the root $0$, i.e. the minimal
number of edges of $T_n$ which are needed to connect $0$ and $V_1, \ldots, V_k$. In particular,
$L_{1,n}$ should be thought of as the height of a typical vertex in $T_n$. Let
$\ell : \mathbb{N} \to \mathbb{R}_+$ be some function with $\lim_{n\to\infty} \ell(n) = \infty$.
We introduce the hypothesis

$$\frac{1}{\ell(n)} L_{k,n} \Rightarrow L_k , \tag{$H_k$}$$

where $\Rightarrow$ means weak convergence and $L_k$ is some random variable with values in
$\mathbb{R}_+$. We stress that $(H_k)$ can be assumed to hold for different values of $k$, and
then only convergence in the sense of one-dimensional distributions is involved.

In several examples, the function $\ell$ is a logarithm and $L_k = ak$ with $a$ a positive
constant. For instance, this happens for some important families of random trees, such as
recursive trees, binary search trees, etc.; see [5]. Aldous [1] considered a different class of
examples, including the case when $T_n$ is a Cayley tree of size $n + 1$ (i.e. a tree picked
uniformly at random amongst the $(n + 1)^{n-1}$ trees on $V_n$), for which it is known that
$(H_k)$ holds with $\ell(n) = \sqrt{n}$ and $L_k$ a chi-variable with $2k$ degrees of freedom.

We now state the central result of this section. 
Theorem 1 For an arbitrary $c > 0$, consider the regime

$$p(n) = 1 - \frac{c}{\ell(n)} + o(1/\ell(n)). \tag{1}$$

(i) If $(H_k)$ holds for every $k \in \mathbb{N}$, then we have in the regime (1)

$$n^{-1} C^0_{p(n)} \Rightarrow G(c) , \tag{2}$$

where $G(c) \not\equiv 0$ is a random variable whose law is determined by its integer moments:

$$\mathbb{E}(G(c)^k) = \mathbb{E}(e^{-cL_k}) , \qquad k \in \mathbb{N}. \tag{3}$$

In particular $\lim_{c \to 0+} G(c) = 1$ in probability.

(ii) Conversely, suppose that for every $c > 0$, (2) holds in the regime (1) for some random
variable $G(c)$ with values in $[0, 1]$. Suppose further that $\lim_{c \to 0+} G(c) = 1$ in
probability. Then $(H_k)$ is fulfilled for every $k \ge 1$, with $L_k$ a nonnegative random
variable whose Laplace transform is given by (3).

Proof: The proof relies on the observation that for each $k \ge 1$, there is the identity

$$\mathbb{E}\left(\left((n+1)^{-1} C^0_{p(n)}\right)^k\right) = \mathbb{E}\left(p(n)^{L_{k,n}}\right). \tag{4}$$

Indeed, recall that $V_1, \ldots, V_k$ are $k$ i.i.d. uniformly distributed vertices, which are
independent of the percolation process. This enables us to interpret the left-hand side of (4) as
the probability that $V_1, \ldots, V_k$ all belong to the percolation cluster containing the root.
On the other hand, this event occurs if and only if every edge of the tree reduced to
$V_1, \ldots, V_k$ and the root has been kept, so the same probability can also be expressed in
terms of the length $L_{k,n}$ of this reduced tree, as the right-hand side of (4).
The assumption $(H_k)$ entails that in the regime (1),

$$\lim_{n\to\infty} \mathbb{E}\left(p(n)^{L_{k,n}}\right) = \lim_{n\to\infty} \mathbb{E}\left(\exp\left(-\frac{c}{\ell(n)} L_{k,n}\right)\right) = \mathbb{E}\left(e^{-cL_k}\right),$$

and then we deduce from (4) that

$$\lim_{n\to\infty} \mathbb{E}\left(\left((n+1)^{-1} C^0_{p(n)}\right)^k\right) = \mathbb{E}\left(e^{-cL_k}\right).$$

Thus, if $(H_k)$ holds for every $k \in \mathbb{N}$, then $(n+1)^{-1} C^0_{p(n)}$ converges in law
to some variable $G(c)$ with values in $[0, 1]$. More precisely, the law of $G(c)$ is determined by
its integer moments $\mathbb{E}(G(c)^k) = \mathbb{E}(e^{-cL_k}) > 0$; this proves (2).

Conversely, as the size of a cluster cannot exceed $n + 1$, (2) implies that in the regime (1),
we have for every integer $k \ge 1$

$$\lim_{n\to\infty} \mathbb{E}\left(\left((n+1)^{-1} C^0_{p(n)}\right)^k\right) = \mathbb{E}(G(c)^k).$$

From (4), we rewrite this as

$$\lim_{n\to\infty} \mathbb{E}\left(p(n)^{L_{k,n}}\right) = \mathbb{E}(G(c)^k).$$

Plugging in the expression (1) for the parameter $p(n)$, we easily derive that

$$\lim_{n\to\infty} \mathbb{E}\left(\exp\left(-\frac{c}{\ell(n)} L_{k,n}\right)\right) = \mathbb{E}(G(c)^k).$$

Recall the assumption that $\lim_{c\to 0+} G(c) = 1$ in probability; in particular
$\lim_{c\to 0+} \mathbb{E}(G(c)^k) = 1$. We conclude from Theorem XIII.1.2 in Feller [7] on
page 431 that for each $k \ge 1$, the function $c \mapsto \mathbb{E}(G(c)^k)$ is the Laplace
transform of a random variable $L_k \ge 0$, and that $(H_k)$ holds. □
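
Identity (4) is exact for every finite tree, so it can be checked by brute force on a small example. The sketch below is only an illustration: the function names, the encoding of a rooted tree by a `parent` list, and the toy tree are assumptions made here and are not part of the original argument. It enumerates all edge configurations for the left-hand side and all $k$-tuples of vertices for the right-hand side, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def path_edges(parent, v):
    """Edges on the path from v to the root 0; an edge is labelled by its lower endpoint."""
    edges = set()
    while v != 0:
        edges.add(v)
        v = parent[v]
    return edges

def root_cluster_moment(parent, p, k):
    """E[((n+1)^{-1} C^0)^k], summing over all 2^n percolation configurations."""
    n = len(parent) - 1
    total = Fraction(0)
    for kept_bits in product([True, False], repeat=n):
        kept = {v for v, bit in zip(range(1, n + 1), kept_bits) if bit}
        weight = p ** len(kept) * (1 - p) ** (n - len(kept))
        # v lies in the root cluster iff every edge on its path to 0 is kept
        size = sum(path_edges(parent, v) <= kept for v in range(n + 1))
        total += weight * Fraction(size, n + 1) ** k
    return total

def reduced_tree_moment(parent, p, k):
    """E[p^{L_{k,n}}], summing over all k-tuples of vertices of {0,...,n}."""
    n = len(parent) - 1
    total = Fraction(0)
    for vertices in product(range(n + 1), repeat=k):
        reduced = set().union(*(path_edges(parent, v) for v in vertices))
        total += p ** len(reduced)
    return total / (n + 1) ** k

# Hypothetical toy tree on {0,...,4}: parent[v] is the neighbour of v towards the root.
toy_parent = [None, 0, 0, 1, 1]
p = Fraction(2, 3)
for k in (1, 2, 3):
    assert root_cluster_moment(toy_parent, p, k) == reduced_tree_moment(toy_parent, p, k)
```

The enumeration is exponential in $n$ and is meant only as a sanity check of (4) on tiny trees, not as a way to study the asymptotic regime (1).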

We next point at an interesting consequence of Theorem 1 for the characterization of the cases
in which the proportion of vertices in the root cluster converges in probability to a constant.
We consider the situation where the variables $L_k$ appearing in Hypotheses $(H_k)$ are of the
form

$$L_k = \xi_1 + \cdots + \xi_k , \tag{$H'_k$}$$

where $\xi_1, \xi_2, \ldots$ is a sequence of i.i.d. variables in $\mathbb{R}_+$.

Corollary 1 (i) Suppose that $(H_k)$ and $(H'_k)$ hold for $k = 1, 2$. Then in the regime (1), we
have

$$\lim_{n\to\infty} n^{-1} C^0_{p(n)} = \theta(c) \quad \text{in probability}, \tag{5}$$

where $\theta(c) = \mathbb{E}(e^{-c\xi_1}) > 0$. Further $(H_k)$ and $(H'_k)$ hold for every
$k \ge 1$.

(ii) Conversely, suppose that for every $c > 0$, (5) holds in the regime (1) for some function
$\theta : [0, \infty) \to [0, 1]$ such that $\lim_{c\to 0+} \theta(c) = 1$. Then $\theta$ is the
Laplace transform of a nonnegative random variable $\xi$, and $(H_k)$ and $(H'_k)$ are fulfilled
for every $k \ge 1$ with $\xi_1, \xi_2, \ldots$ a sequence of i.i.d. copies of $\xi$.

Proof: When $(H'_k)$ holds, we have $\mathbb{E}(\exp(-cL_k)) = \theta(c)^k$, with
$\theta(c) = \mathbb{E}(e^{-c\xi_1})$. We now see from the proof of Theorem 1 that Hypotheses
$(H_k)$ and $(H'_k)$ entail that in the regime (1), we have

$$\lim_{n\to\infty} \mathbb{E}\left(\left(n^{-1} C^0_{p(n)}\right)^k\right) = \theta(c)^k .$$

In particular, if $(H_k)$ and $(H'_k)$ hold for $k = 1, 2$, then

$$\lim_{n\to\infty} \mathbb{E}\left(\left(n^{-1} C^0_{p(n)} - \theta(c)\right)^2\right) = 0 ,$$

which proves (5).

Conversely, if (5) holds, then we can apply Theorem 1 (ii) with $G(c) = \theta(c)$. In particular
we know that $(H_k)$ holds for all $k \in \mathbb{N}$. Further, we get that
$\mathbb{E}(e^{-cL_k}) = \theta(c)^k$, which in turn shows that $(H'_k)$ is fulfilled. □
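
The passage from the first two moments to convergence in probability in part (i) can be made explicit. The following expansion is a routine sketch, writing $X_n = n^{-1} C^0_{p(n)}$ (a shorthand introduced here only for readability) and using the limits $\mathbb{E}(X_n^k) \to \theta(c)^k$ for $k = 1, 2$:

```latex
\begin{align*}
\mathbb{E}\bigl((X_n - \theta(c))^2\bigr)
  &= \mathbb{E}(X_n^2) - 2\,\theta(c)\,\mathbb{E}(X_n) + \theta(c)^2 \\
  &\xrightarrow[n\to\infty]{} \theta(c)^2 - 2\,\theta(c)\cdot\theta(c) + \theta(c)^2 = 0 .
\end{align*}
```

Convergence in $L^2$ to the constant $\theta(c)$ then yields (5) by Chebyshev's inequality.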

We now conclude this section by pointing at a simple criterion which ensures that the cluster 
containing the root is the unique giant component. 

Proposition 1 In the preceding notation, assume that there is the joint weak convergence

$$\frac{1}{\ell(n)} (L_{1,n}, L_{2,n}) \Rightarrow (L_1, L_2),$$

where $(L_1, L_2)$ is a pair of random variables such that $L_2 - L_1$ has the same law as $L_1$.
Then for every $c > 0$, in the regime (1), we have

$$\lim_{n\to\infty} n^{-1} C^1_{p(n)} = 0 \quad \text{in probability},$$

where $C^1_{p(n)}$ denotes the size of the largest percolation cluster which does not contain the
root $0$.

Proof: Recall that $V_1$ and $V_2$ denote two independent uniformly distributed random vertices.
Plainly, the probability $q(n)$ that $V_1$ and $V_2$ both belong to the same percolation cluster
and are disconnected from $0$ can be bounded from below by
$(n+1)^{-2}\, \mathbb{E}(|C^1_{p(n)}|^2)$.
On the other hand, $q(n)$ is bounded from above by the probability that at least one edge of
the branch from the root to the branch-point $V_1 \wedge V_2$ of $V_1$ and $V_2$ has been removed,
viz.

$$(n+1)^{-2}\, \mathbb{E}\left(|C^1_{p(n)}|^2\right) \le q(n) \le 1 - \mathbb{E}\left(p(n)^{d_n(0, V_1 \wedge V_2)}\right), \tag{6}$$

where $d_n$ denotes the graph distance in $T_n$.
Next, write

$$L_{2,n} = d_n(0, V_1) + d_n(0, V_2) - d_n(0, V_1 \wedge V_2) .$$

Since $L_{1,n} = d_n(0, V_1)$ has the same law as $d_n(0, V_2)$, it follows from our assumption that
the sequences $\ell(n)^{-1} d_n(0, V_2)$ and
$\ell(n)^{-1} (d_n(0, V_2) - d_n(0, V_1 \wedge V_2))$ converge weakly to the same
distribution. This readily implies that

$$d_n(0, V_1 \wedge V_2) = o(\ell(n)) \quad \text{in probability},$$

and we conclude that the right-hand side in (6) tends to $0$ as $n \to \infty$. □



3 Almost giant clusters 

In this section, we turn our attention to the percolation clusters which do not contain the root.
We write

$$C^1_{p(n)} \ge C^2_{p(n)} \ge \cdots$$

for the sequence of their sizes, ranked in decreasing order¹. A natural problem is then to
determine the asymptotic behavior of this sequence. We first point out that Hypotheses $(H_k)$
are insufficient to characterize the latter, by considering three simple examples in which very
different behaviors can be observed.

First, imagine that $T_n$ is a star-shaped tree centered at $0$, meaning that the root is the
unique branching point. Suppose also for simplicity that there are $\sim n^{1-\alpha}$ branches
attached to the root, each of size $\sim n^{\alpha}$, where $\alpha \in (0, 1)$ is some fixed
parameter. Then one readily checks that $(H_k)$ and $(H'_k)$ hold for every $k \ge 1$ with
$\ell(n) = n^{\alpha}$ and $L_k = \xi_1 + \cdots + \xi_k$, where the $\xi_i$ are i.i.d. uniformly
distributed on $[0, 1]$. It is further straightforward to see that in the regime (1), one
has

P{n) ^p(n) p(n) 

for every fixed $j \in \mathbb{N}$.

Second, consider the case when $T_n$ is the complete regular $d$-ary tree with height $h$, where
$d \ge 2$ is some integer. So there are $d^j$ vertices at distance $j = 0, 1, \ldots, h$ from the
root and

$$n = n(h) = d(d^h - 1)/(d - 1).$$

One readily checks that Hypotheses $(H_k)$ and $(H'_k)$ hold for every $k \ge 1$ with
$\ell(n) = \ln n$ and $\xi_i \equiv 1/\ln d$. Because the subtree spanned by a vertex at height
$j \le h$ is again a complete regular $d$-ary tree with height $h - j$, we deduce from the
preceding section that in the regime (1), the size of the largest cluster which does not contain
the root is close to

$$e^{-c/\ln d}\, d^{\,h - \kappa(h) + 1}/(d - 1),$$

where $\kappa(h)$ is the smallest height at which an edge has been removed. Recall that there are
$d(d^j - 1)/(d - 1)$ edges with height at most $j$, so the law of $\kappa(h)$ is given by

$$\mathbb{P}(\kappa(h) > j) = p(n)^{d(d^j - 1)/(d - 1)} , \qquad j = 1, \ldots, h.$$

It follows readily that in the regime (1), the sequence
$(\kappa(h) - \ln h/\ln d : h \in \mathbb{N})$ is tight. We stress however that this sequence
does not converge in distribution as $h \to \infty$; more precisely, weakly convergent
subsequences are obtained provided that the fractional part $\{\ln h/\ln d\}$ converges. It
follows that the sequence $\left(\frac{\ln n}{n} C^1_{p(n)} : n = n(h), h \in \mathbb{N}\right)$
is also tight. It does not converge as $h \to \infty$; however weakly convergent subsequences can
be extracted provided that $\{\ln h/\ln d\}$ converges.

¹ Beware that this convenient notation may be slightly misleading, since $C^0_{p(n)}$ is always
the size of the cluster containing the root $0$, while for $i \ge 1$, $C^i_{p(n)}$ is in general
not the size of the cluster containing the vertex $i$.

Third, we recall that in the case of Cayley trees, Pitman [11, 12] showed that for
$1 - p(n) \sim c/\sqrt{n}$ with a fixed $c > 0$, the sequence of the sizes of the clusters ranked
in decreasing order and renormalized by a factor $1/n$ converges weakly as $n \to \infty$ to a
random mass partition which can be described explicitly in terms of a conditioned Poisson measure.
It is interesting to observe that in this situation, the number of giant components is unbounded
as $n \to \infty$. We stress that the conditions of Proposition 1 and the hypotheses $(H'_k)$ for
$k \ge 2$ fail for Cayley trees.

We shall now study the asymptotic behavior of the sizes of the largest clusters which do not
contain the root for two families of random trees with logarithmic heights, i.e. which fulfill
$(H_k)$ with $\ell(n) = \ln n$. In particular, we shall point out that in the regime (1), the
largest percolation clusters which do not contain the root fail to be giant only by a logarithmic
factor.

3.1 Random recursive trees 

A tree on an ordered set of vertices is called recursive if, when we agree that the smallest vertex 
serves as the root, then the sequence of vertices along any branch from the root to a leaf is 
increasing. Recursive trees are sometimes also known as increasing trees in the literature; they 
arise for instance in computer science as data structures, or as simple epidemic models.

Of course, there is no loss of generality in assuming that the set of vertices is
$V_n = \{0, 1, \ldots, n\}$ (and then $0$ is the root); however other ordered sets may arise
naturally in this setting as we shall see. Each recursive tree on $V_n$ encodes a permutation of
$\{1, \ldots, n\}$ in such a way that the subtrees attached to the root correspond to the cycles
of the permutation, and this encoding is bijective; see Section 6.1.1 in [5]. In particular, there
are $n!$ recursive trees on $V_n$; we pick one of them uniformly at random and denote it by $T_n$.
In other words, $T_n$ can be viewed as a Cayley tree on $V_n$, subject to the condition that the
sequence of vertices along any branch from the root to a leaf is increasing. We stress that,
informally, the conditioning becomes singular as $n \to \infty$. Indeed the geometries of large
Cayley trees and large uniform recursive trees are notoriously different; for instance the typical
height of the former is of order $\sqrt{n}$ while that of the latter is only of order $\ln n$.

There is an elementary algorithm for constructing $T_n$ which is closely related to the so-called
Chinese restaurant process (see, e.g. Section 3.1 in Pitman [12]), and hence further points at
the connection with uniform random permutations. For every $i = 1, \ldots, n$, we pick a vertex
$U_i$ uniformly at random from $\{0, \ldots, i - 1\}$ and independently of the $U_j$ for
$j \ne i$. The random tree induced by the set of edges $\{(i, U_i) : i = 1, \ldots, n\}$ is then a
version of $T_n$.
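
The construction just described translates directly into code. The following is a minimal sketch, not taken from [3]: the function names and the bookkeeping of clusters by their smallest vertex are choices made here for illustration. It builds $T_n$ from the variables $U_i$ and then performs Bernoulli bond percolation, exploiting the fact that each parent has a smaller label than its child, so that a single increasing pass determines all clusters.

```python
import math
import random

def random_recursive_tree(n, rng):
    """parent[i] = U_i, uniform on {0,...,i-1}, for i = 1..n; parent[0] is None (root)."""
    return [None] + [rng.randrange(i) for i in range(1, n + 1)]

def percolation_cluster_sizes(parent, p, rng):
    """Keep each edge (i, parent[i]) with probability p, independently.

    Returns a dict mapping the smallest vertex of each cluster to the cluster size;
    the key 0 corresponds to the root cluster.
    """
    n = len(parent) - 1
    top = [0] * (n + 1)      # top[i] = smallest vertex of the cluster containing i
    sizes = {0: 1}
    for i in range(1, n + 1):        # parent[i] < i, so top[parent[i]] is already known
        if rng.random() < p:
            top[i] = top[parent[i]]  # edge kept: i joins the cluster of its parent
        else:
            top[i] = i               # edge removed: i starts a new cluster
        sizes[top[i]] = sizes.get(top[i], 0) + 1
    return sizes

if __name__ == "__main__":
    rng = random.Random(0)
    n, c = 20000, 1.0
    p = 1 - c / math.log(n)          # regime (1) with l(n) = ln n and no o() term
    sizes = percolation_cluster_sizes(random_recursive_tree(n, rng), p, rng)
    print("root cluster proportion:", sizes[0] / (n + 1), " exp(-c):", math.exp(-c))
```

One should expect only rough agreement between the two printed values for moderate $n$, since the scale $\ell(n) = \ln n$ grows slowly.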

Uniform recursive trees fulfill an important splitting property which is the key to many of
their features. Fix an arbitrary $k \in \{1, \ldots, n\}$ and remove the edge between $k$ and its
parent $U_k$. This disconnects $T_n$ into two subtrees, say $T$ and $T'$. If we denote by $V$
(respectively, $V'$) the set of vertices of $T$ (respectively, of $T'$), then conditionally on $V$
and $V'$, $T$ and $T'$ are two independent uniform recursive trees with respective sets of vertices
$V$ and $V'$. This basic property is easy to check, either directly from the definition, or from
the Chinese restaurant construction of $T_n$.

It is easy to verify that the conditions $(H_k)$ are fulfilled for all $k \ge 1$ with
$\ell(n) = \ln n$ and $L_k = k$; see Section 6.2.5 in [5]. We conclude from the preceding section
that in the regime (1), the cluster containing the root is the unique giant percolation cluster
of $T_n$, and more precisely that (5) holds with $\theta(c) = e^{-c}$. The main result of [3] is
that the next largest clusters are almost giant, and more precisely, one has the following weak
limit theorem.

Theorem 2 Let $T_n$ denote a uniform random recursive tree on $\{0, 1, \ldots, n\}$. For every
fixed integer $j$, in the regime (1) with $\ell(n) = \ln n$,

$$\left(\frac{\ln n}{n} C^1_{p(n)}, \ldots, \frac{\ln n}{n} C^j_{p(n)}\right)$$

converges in distribution towards

$$(x_1, \ldots, x_j),$$

where $x_1 > x_2 > \cdots$ denotes the sequence of the atoms of a Poisson random measure on
$(0, \infty)$ with intensity $c e^{-c} x^{-2}\, dx$.

There is an equivalent simple description of the law of the limiting sequence, namely $1/x_1$,
$1/x_2 - 1/x_1, \ldots, 1/x_j - 1/x_{j-1}$ are i.i.d. exponential variables with parameter
$c e^{-c}$. In particular $x_j$ has the same distribution as the inverse of a gamma variable with
parameter $(j, c e^{-c})$, and $\lim_{j\to\infty} j x_j = c e^{-c}$ in probability.
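
This description gives a direct way to sample the limiting atoms. The sketch below is illustrative only; the function name and the parameter `rate` (standing for $c e^{-c}$) are introduced here for convenience. It accumulates i.i.d. exponential gaps to obtain $1/x_1 < 1/x_2 < \cdots$ and then inverts them.

```python
import math
import random

def largest_atoms(j, rate, rng):
    """First j atoms x_1 > x_2 > ... of a Poisson measure with intensity rate * x**-2 dx.

    Uses the fact that 1/x_1, 1/x_2 - 1/x_1, ... are i.i.d. exponential with parameter rate.
    """
    atoms = []
    inverse = 0.0
    for _ in range(j):
        inverse += rng.expovariate(rate)   # expovariate takes the rate, i.e. 1/mean
        atoms.append(1.0 / inverse)
    return atoms

if __name__ == "__main__":
    c = 1.0
    rate = c * math.exp(-c)
    xs = largest_atoms(100000, rate, random.Random(0))
    # By the law of large numbers, j * x_j should be close to rate for large j.
    print(len(xs) * xs[-1], rate)
```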

The basic idea in [3] for establishing Theorem 2 is to relate percolation on a rooted tree
$T$ to a random algorithm for the isolation of its root that was introduced by Meir and Moon.
Specifically, following these authors, we can imagine that we remove an edge in $T$ uniformly at
random, disconnecting $T$ into two subtrees. We set aside the subtree which does not contain the
root and iterate in an obvious way with the subtree containing the root, until the root is finally
isolated. Loosely speaking, we can think of this algorithm as a dynamical version of percolation
(i.e. edges are now removed one after the other rather than simultaneously), except that each
time an edge is removed, the cluster which does not contain the root is instantaneously frozen,
in the sense that only edges belonging to the cluster that contains the root can be removed.

The upshot of this point of view is that it enables us to use a coupling due to Iksanov and
Möhle [8], which, informally, identifies the sequence of the sizes of the frozen subtrees which
arise from the isolation of the root algorithm with a sequence $\eta_1, \eta_2, \ldots, \eta_k$ of
i.i.d. variables with distribution

$$\mathbb{P}(\eta = j) = \frac{1}{j(j+1)} , \qquad j \in \mathbb{N},$$

at least as long as $\eta_1 + \cdots + \eta_k \le n$. In short, this coupling follows from the
splitting property of random recursive trees, and the following remarkable fact observed by Meir
and Moon [10]. Imagine that we remove an edge of $T_n$ uniformly at random, and consider the size
of the resulting subtree that does not contain the root. Then the latter has the same distribution
as $\eta$ conditioned on $\eta \le n$.
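
The fact attributed to Meir and Moon can be checked exactly for small $n$ by enumerating all $n!$ recursive trees on $\{0, \ldots, n\}$. The sketch below is only a sanity check with hypothetical function names; it assumes that $\eta$ has law $\mathbb{P}(\eta = k) = 1/(k(k+1))$ and that, since $T_n$ has $n + 1$ vertices and the cut subtree has between $1$ and $n$ vertices, the conditioning is on $\eta \le n$.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def cut_subtree_size_law(n):
    """Exact law of the size of the non-root subtree obtained by removing a uniform
    edge from a uniform recursive tree on {0,...,n} (brute force over all n! trees)."""
    counts = [0] * (n + 1)
    for parents in product(*(range(i) for i in range(1, n + 1))):
        parent = (None,) + parents           # parent[i] in {0,...,i-1}
        subtree = [1] * (n + 1)              # subtree[v] = size of the subtree rooted at v
        for v in range(n, 0, -1):            # children carry larger labels than parents
            subtree[parent[v]] += subtree[v]
        for v in range(1, n + 1):            # removing edge (v, parent[v]) cuts off subtree[v]
            counts[subtree[v]] += 1
    total = n * factorial(n)                 # n edges in each of the n! trees
    return {k: Fraction(counts[k], total) for k in range(1, n + 1)}

def eta_conditioned_law(n):
    """Law of eta given eta <= n, where P(eta = k) = 1/(k(k+1))."""
    norm = Fraction(n, n + 1)                # P(eta <= n) = 1 - 1/(n+1)
    return {k: Fraction(1, k * (k + 1)) / norm for k in range(1, n + 1)}

assert cut_subtree_size_law(4) == eta_conditioned_law(4)
```

For instance, for $n = 2$ both laws assign mass $3/4$ to size $1$ and $1/4$ to size $2$.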

The coupling of Iksanov and Möhle enables us to use Extreme Value Theory and determine
the asymptotic behavior of the sizes of these frozen subtrees, jointly with the steps of the
algorithm at which they have appeared. In short, one finds that the largest frozen subtrees have
size of order $n/\ln n$ and a precise limit theorem can be given in terms of the atoms of some
Poisson random measure. It then remains to de-freeze each of these subtrees by performing an
additional Bernoulli percolation with a suitable parameter, to recover the outcome of
percolation on $T_n$. Roughly speaking, each of these frozen subtrees can be viewed conditionally
on its size as a uniform recursive tree. As a consequence, the additional percolation produces a
single relatively giant component of size again of order $n/\ln n$ and further clusters of smaller
size $O(n/\ln^2 n)$. In particular, the largest percolation clusters of $T_n$ which do not contain
the root correspond to simple transformations of the frozen subtrees arising from the algorithm of
isolation of the root, and their limiting distribution is obtained as the image of some Poisson
random measure.



3.2 Scale-free random trees 

Scale-free random trees form a one-parameter family of random trees that grow following a
preferential attachment algorithm; see [2]. Fix a parameter $\beta \in (-1, \infty)$, and start
for $n = 1$ from the unique tree $T_1^{(\beta)}$ on $V_1 = \{0, 1\}$, which has a single edge
connecting $0$ and $1$. Then suppose that $T_n^{(\beta)}$ has been constructed for some
$n \ge 1$, and for every $i \in V_n = \{0, \ldots, n\}$, denote by $d_n(i)$ the degree of the
vertex $i$ in $T_n^{(\beta)}$. Conditionally given $T_n^{(\beta)}$, we construct the tree
$T_{n+1}^{(\beta)}$ by incorporating the new vertex $n + 1$ to $T_n^{(\beta)}$ and adding an edge
between $n + 1$ and a vertex $v_n \in \{0, \ldots, n\}$ chosen at random according to the law

$$\mathbb{P}(v_n = i) = \frac{d_n(i) + \beta}{2n + \beta(n + 1)} , \qquad i \in \{0, \ldots, n\}.$$

Recall that there is the identity $\sum_{i=0}^{n} d_n(i) = 2n$ (because $T_n^{(\beta)}$ is a tree
with $n + 1$ vertices), so the preceding indeed defines a probability on $\{0, \ldots, n\}$. Note
also that when one lets $\beta \to \infty$, then $v_n$ becomes uniformly distributed on
$\{0, \ldots, n\}$, and the algorithm yields a uniform recursive tree as in the preceding section.
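
The growth rule can be sketched in a few lines. The code below is an illustration under a stated assumption: it takes the attachment law to be $\mathbb{P}(v_n = i) = (d_n(i) + \beta)/(2n + \beta(n+1))$, which is the normalization forced by the identity $\sum_i d_n(i) = 2n$. The function names are hypothetical and not taken from [2] or [4].

```python
import random
from fractions import Fraction

def attachment_law(degrees, beta):
    """Assumed law of v_n: weight d_n(i) + beta, normalized by 2n + beta*(n+1)."""
    n = len(degrees) - 1
    total = 2 * n + beta * (n + 1)
    return [(d + beta) / total for d in degrees]

def grow_scale_free_tree(n, beta, rng):
    """Grow a tree on {0,...,n} by preferential attachment, starting from the edge {0,1}.

    Returns (parent, degrees), where parent[i] is the vertex to which i was attached.
    """
    parent = [None, 0]
    degrees = [1, 1]
    for m in range(1, n):                     # current tree has vertices 0..m
        weights = attachment_law(degrees, beta)
        v = rng.choices(range(m + 1), weights=weights)[0]
        parent.append(v)                      # new vertex m+1 is attached to v
        degrees[v] += 1
        degrees.append(1)
    return parent, degrees

# With exact rationals the assumed law sums to 1, e.g. for degrees (2, 1, 1) and beta = 3.
assert sum(attachment_law([2, 1, 1], Fraction(3))) == 1
```

Since $\beta > -1$ and every degree is at least $1$, all weights are positive, which is what `random.choices` requires here.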

Just as for recursive trees, one can check that the conditions $(H_k)$ are fulfilled for all
$k \ge 1$ with $\ell(n) = \ln n$ and $L_k = k(1 + \beta)/(2 + \beta)$; see for instance
Section 4.4 in [6] in the case $\beta = 0$. Hence we know from Theorem 1 and Proposition 1 that
percolation in the regime (1) produces a single giant cluster, and more precisely that (2) holds
with $G(c) = \theta(c) = e^{-c(1+\beta)/(2+\beta)}$. It has been shown recently in [4] that the
asymptotic behavior of the sizes of the largest clusters for percolation on a scale-free tree is
similar to that on a random recursive tree.

Theorem 3 Let $T_n = T_n^{(\beta)}$ denote a random scale-free tree on $\{0, 1, \ldots, n\}$ with
parameter $\beta > -1$. For every fixed integer $j$, in the regime (1) with $\ell(n) = \ln n$,

$$\left(\frac{\ln n}{n} C^1_{p(n)}, \ldots, \frac{\ln n}{n} C^j_{p(n)}\right)$$

converges in distribution towards

$$(x_1, \ldots, x_j),$$

where $x_1 > x_2 > \cdots$ denotes the sequence of the atoms of a Poisson random measure on
$(0, \infty)$ with intensity $c e^{-c(1+\beta)/(2+\beta)} x^{-2}\, dx$.

The key splitting property of random recursive trees fails for scale-free random trees, and the
approach in [4] for establishing Theorem 3 thus departs significantly from that for Theorem 2.
In short, one superposes Bernoulli bond percolation to the growth algorithm with preferential
attachment as follows. Each time an edge is inserted, we draw an independent Bernoulli variable
$\epsilon$ with parameter $p(n)$. If $\epsilon = 1$, the edge is left intact; otherwise we cut
this edge in two at its mid-point. The upshot of cutting rather than removing edges is that the
former procedure preserves the degrees of vertices, where the degree of a vertex is defined as the
total number of intact edges and half-edges attached to it. This is crucial for running the
construction with preferential attachment.

This enables us to adapt a classical idea in this area (see, e.g. [6]), namely to consider a
continuous-time version of the growth algorithm with preferential attachment and interpret
the latter in terms of continuous-time branching processes. Roughly speaking, incorporating
percolation into the algorithm yields systems of branching processes with rare neutral mutations,
where a mutation event corresponds to the insertion of an edge that is cut at its mid-point. Each
branching process in the system corresponds to a percolation cluster which grows following
dynamics with preferential attachment. One has to study carefully the asymptotic behavior of
such systems of branching processes with neutral mutations, and then derive Theorem 3.

Acknowledgments. I would like to thank David Croydon for a question that he raised during
the workshop Random Media II at Tohoku University in September 2012, which has motivated
Section 2 of the present text.